Wishart Distribution




Thinning a Wishart Random Matrix

Dharamshi, Ameer, Neufeld, Anna, Gao, Lucy L., Witten, Daniela, Bien, Jacob

arXiv.org Machine Learning

Recent work has explored data thinning, a generalization of sample splitting that involves decomposing a (possibly matrix-valued) random variable into independent components. In the special case of an $n \times p$ random matrix with independent and identically distributed $N_p(\mu, \Sigma)$ rows, Dharamshi et al. (2024a) provide a comprehensive analysis of the settings in which thinning is or is not possible: briefly, if $\Sigma$ is unknown, then one can thin provided that $n > 1$. However, in some situations a data analyst may not have direct access to the data itself. For example, to preserve individuals' privacy, a data bank may provide only summary statistics such as the sample mean and sample covariance matrix. While the sample mean follows a Gaussian distribution, the sample covariance follows (up to scaling) a Wishart distribution, for which no thinning strategies have yet been proposed. In this note, we fill this gap: we show that it is possible to generate two independent data matrices with independent $N_p(\mu, \Sigma)$ rows, based only on the sample mean and sample covariance matrix. These independent data matrices can either be used directly within a train-test paradigm, or can be used to derive independent summary statistics. Furthermore, they can be recombined to yield the original sample mean and sample covariance.
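
As a rough illustration of the reconstruction idea, the sketch below draws an $n \times p$ data matrix whose sample mean and sample covariance exactly match the supplied summary statistics, by rotating a Cholesky factor with a Haar-uniform orthonormal frame. This is a minimal sketch assuming $n - 1 \ge p$, written against NumPy/SciPy; the function name regenerate_gaussian is illustrative, and the snippet shows only the flavor of the construction, not the authors' exact thinning procedure.

    import numpy as np
    from scipy.stats import ortho_group

    def regenerate_gaussian(xbar, S, n, seed=None):
        # Draw an n x p matrix whose sample mean is xbar and whose sample
        # covariance is S, uniformly over all such matrices (needs n - 1 >= p).
        rng = np.random.default_rng(seed)
        p = len(xbar)
        C = np.linalg.cholesky((n - 1) * S)                  # C @ C.T = (n-1) S
        Q = ortho_group.rvs(n - 1, random_state=rng)[:, :p]  # Haar orthonormal columns
        Z = Q @ C.T                                          # Z.T @ Z = (n-1) S
        A = np.column_stack([np.ones(n), rng.standard_normal((n, n - 1))])
        H = np.linalg.qr(A)[0][:, 1:]       # orthonormal columns, each orthogonal to 1_n
        return xbar + H @ Z                 # sample mean xbar, sample covariance S

For Gaussian data, the centered data matrix conditional on $(\bar{x}, S)$ is uniform over matrices with the given Gram matrix, which is why a Haar-uniform $Q$ suffices; splitting the rows of the regenerated matrix is then in the spirit of the train-test use described above.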


Elliptical Wishart distributions: information geometry, maximum likelihood estimator, performance analysis and statistical learning

Ayadi, Imen, Bouchard, Florent, Pascal, Frédéric

arXiv.org Machine Learning

This paper deals with Elliptical Wishart distributions (which generalize the Wishart distribution) in the context of signal processing and machine learning. Two algorithms to compute the maximum likelihood estimator (MLE) are proposed: a fixed point algorithm and a Riemannian optimization method based on the derived information geometry of Elliptical Wishart distributions. The existence and uniqueness of the MLE are characterized, as well as the convergence of both estimation algorithms. Statistical properties of the MLE are also investigated, such as consistency, asymptotic normality, and an intrinsic version of Fisher efficiency. On the statistical learning side, novel classification and clustering methods are designed. For the $t$-Wishart distribution, the performance of the MLE and statistical learning algorithms is evaluated on both simulated and real EEG and hyperspectral data, showcasing the practical value of our proposed methods.
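
For context on what the elliptical case generalizes, the classical Wishart MLE is available in closed form: for i.i.d. $S_1, \dots, S_N \sim \mathcal{W}_p(n, \Sigma)$ with known degrees of freedom $n$, the log-likelihood is maximized at $\hat{\Sigma} = \frac{1}{Nn}\sum_i S_i$. A minimal sketch of this baseline (not the paper's fixed-point or Riemannian algorithm, which targets the elliptical family):

    import numpy as np
    from scipy.stats import wishart

    rng = np.random.default_rng(0)
    p, n, N = 4, 10, 5000                     # dimension, dof, number of samples
    Sigma = np.eye(p) + 0.3                   # true scale matrix (positive definite)
    S = wishart.rvs(df=n, scale=Sigma, size=N, random_state=rng)

    Sigma_hat = S.mean(axis=0) / n            # closed-form Wishart MLE of the scale
    print(np.linalg.norm(Sigma_hat - Sigma))  # small for large N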


Impossibility of latent inner product recovery via rate distortion

Mao, Cheng, Zhang, Shenduo

arXiv.org Machine Learning

In this largely expository note, we present an impossibility result for inner product recovery in a random geometric graph or latent space model using rate-distortion theory. More precisely, suppose that we observe a graph $A$ on $n$ vertices with average edge density $p$ generated from Gaussian or spherical latent locations $z_1, \dots, z_n \in \mathbb{R}^d$ associated with the $n$ vertices. It is of interest to estimate the inner products $\langle z_i, z_j \rangle$, which represent the geometry of the latent points. We prove that it is impossible to recover the inner products if $d \gtrsim n h(p)$, where $h(p)$ is the binary entropy function. This matches the condition required for positive results on inner product recovery in the literature. The proof follows well-established rate-distortion theory, with the main technical ingredient being a lower bound on the rate-distortion function of the Wishart distribution, which is interesting in its own right.
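
To make the counting explicit (a hedged reconstruction from the ingredients named above, not the note's precise statement): the graph $A$ carries at most $\binom{n}{2} h(p) \approx \tfrac{1}{2} n^2 h(p)$ bits of information about the latent locations, while a lower bound of order $nd$ bits on the rate-distortion function of the Wishart matrix $(\langle z_i, z_j \rangle)_{i,j} = ZZ^\top$ says that any estimate achieving nontrivial distortion must encode $\Omega(nd)$ bits. Recovery is therefore only possible when $nd \lesssim n^2 h(p)$, i.e. $d \lesssim n h(p)$, matching the threshold above.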


Lower Bounds for Private Estimation of Gaussian Covariance Matrices under All Reasonable Parameter Regimes

Portella, Victor S., Harvey, Nick

arXiv.org Machine Learning

We prove lower bounds on the number of samples needed to privately estimate the covariance matrix of a Gaussian distribution. Our bounds match existing upper bounds in the widest known range of parameter regimes. Our analysis relies on the Stein-Haff identity, an extension of the classical Stein identity that was used in previous fingerprinting lemma arguments.
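
For readers unfamiliar with the starting point, the classical Stein identity for $X \sim N(\mu, \sigma^2)$ states that $\mathbb{E}[(X - \mu)g(X)] = \sigma^2\,\mathbb{E}[g'(X)]$ for suitably smooth $g$; the Stein-Haff identity plays the analogous role for Wishart-distributed matrices. A quick Monte Carlo sanity check of the scalar identity (illustrative only, not the paper's argument):

    import numpy as np

    rng = np.random.default_rng(0)
    mu, sigma = 1.0, 2.0
    x = rng.normal(mu, sigma, size=1_000_000)

    g = np.tanh                                  # any smooth bounded test function
    g_prime = lambda t: 1.0 - np.tanh(t) ** 2    # its derivative

    lhs = np.mean((x - mu) * g(x))               # estimates E[(X - mu) g(X)]
    rhs = sigma ** 2 * np.mean(g_prime(x))       # estimates sigma^2 E[g'(X)]
    print(lhs, rhs)                              # the two agree up to Monte Carlo error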


Bayesian Partitioning of Large-Scale Distance Data

Neural Information Processing Systems

A Bayesian approach to partitioning distance matrices is presented. It is inspired by the Translation-invariant Wishart-Dirichlet process (TIWD) in [1] and shares a number of advantageous properties, such as the fully probabilistic nature of the inference model, automatic selection of the number of clusters, and applicability in semi-supervised settings. In addition, our method (which we call fastTIWD) overcomes the main shortcoming of the original TIWD, namely its high computational cost.


Better and Simpler Lower Bounds for Differentially Private Statistical Estimation

Narayanan, Shyam

arXiv.org Artificial Intelligence

We provide optimal lower bounds for two well-known parameter estimation (also known as statistical estimation) tasks in high dimensions with approximate differential privacy. First, we prove that for any $\alpha \le O(1)$, estimating the covariance of a Gaussian up to spectral error $\alpha$ requires $\tilde{\Omega}\left(\frac{d^{3/2}}{\alpha \varepsilon} + \frac{d}{\alpha^2}\right)$ samples, which is tight up to logarithmic factors. This result improves over previous work which established this for $\alpha \le O\left(\frac{1}{\sqrt{d}}\right)$, and is also simpler than previous work. Next, we prove that estimating the mean of a heavy-tailed distribution with bounded $k$th moments requires $\tilde{\Omega}\left(\frac{d}{\alpha^{k/(k-1)} \varepsilon} + \frac{d}{\alpha^2}\right)$ samples. Previous work for this problem was only able to establish this lower bound against pure differential privacy, or in the special case of $k = 2$. Our techniques follow the method of fingerprinting and are generally quite simple. Our lower bound for heavy-tailed estimation is based on a black-box reduction from privately estimating identity-covariance Gaussians. Our lower bound for covariance estimation utilizes a Bayesian approach to show that, under an Inverse Wishart prior distribution for the covariance matrix, no private estimator can be accurate even in expectation, without sufficiently many samples.
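
To illustrate the Bayesian device in the covariance bound (only the conjugate prior mechanics, not the paper's fingerprinting argument): with a zero-mean Gaussian and an inverse-Wishart prior on $\Sigma$, the posterior is again inverse-Wishart, so prior and posterior can be manipulated in closed form. A sketch with arbitrary dimensions and hyperparameters:

    import numpy as np
    from scipy.stats import invwishart

    rng = np.random.default_rng(0)
    d, nu, n = 5, 8, 100                  # dimension, prior dof (> d - 1), sample size
    Psi = np.eye(d)                       # prior scale matrix

    Sigma = invwishart.rvs(df=nu, scale=Psi, random_state=rng)  # Sigma ~ IW(nu, Psi)
    X = rng.multivariate_normal(np.zeros(d), Sigma, size=n)     # rows X_i ~ N(0, Sigma)

    # Conjugacy: Sigma | X ~ IW(nu + n, Psi + X^T X), whose mean is known in closed form.
    post_scale = Psi + X.T @ X
    post_mean = post_scale / (nu + n - d - 1)
    print(np.linalg.norm(post_mean - Sigma))  # posterior mean tracks the drawn Sigma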


K-Tensors: Clustering Positive Semi-Definite Matrices

Zhang, Hanchao, Tarpey, Thaddeus

arXiv.org Artificial Intelligence

This paper introduces a novel self-consistency clustering algorithm ($K$-Tensors) designed for partitioning a distribution of positive semi-definite matrices based on their eigenstructures. As positive semi-definite matrices can be represented as ellipsoids in $\mathbb{R}^p$, $p \ge 2$, it is critical to maintain their structural information to perform effective clustering. However, traditional clustering algorithms applied to matrices often involve vectorization of the matrices, resulting in a loss of essential structural information. To address this issue, we propose a distance metric for clustering that is specifically based on the structural information of positive semi-definite matrices. This distance metric enables the clustering algorithm to consider the differences between positive semi-definite matrices and their projections onto a common space spanned by orthonormal vectors defined from a set of positive semi-definite matrices. This innovative approach to clustering positive semi-definite matrices has broad applications in several domains including financial and biomedical research, such as analyzing functional connectivity data. By maintaining the structural information of positive semi-definite matrices, our proposed algorithm promises to cluster the positive semi-definite matrices in a more meaningful way, thereby facilitating deeper insights into the underlying data in various applications.
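
The paper's projection-based metric is specific to its construction; as a point of contrast, the sketch below clusters symmetric positive-definite matrices with the standard log-Euclidean distance, named here explicitly because it is a stand-in and not the $K$-Tensors metric. It embeds each matrix via the matrix logarithm and runs ordinary k-means in that space; strictly positive-definite inputs are assumed so the logarithm exists.

    import numpy as np
    from scipy.linalg import logm

    def log_euclidean_kmeans(mats, k, iters=20, seed=0):
        # Embed SPD matrices via the matrix log, then run plain k-means there.
        rng = np.random.default_rng(seed)
        logs = np.array([logm(M).real.ravel() for M in mats])
        centers = logs[rng.choice(len(logs), size=k, replace=False)]
        for _ in range(iters):
            d2 = ((logs[:, None, :] - centers[None, :, :]) ** 2).sum(-1)
            labels = d2.argmin(axis=1)               # nearest-center assignment
            for j in range(k):
                if (labels == j).any():
                    centers[j] = logs[labels == j].mean(axis=0)
        return labels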


An Improved Variational Approximate Posterior for the Deep Wishart Process

Ober, Sebastian, Anson, Ben, Milsom, Edward, Aitchison, Laurence

arXiv.org Artificial Intelligence

Deep kernel processes are a recently introduced class of deep Bayesian models that have the flexibility of neural networks, but work entirely with Gram matrices. They operate by alternately sampling a Gram matrix from a distribution over positive semi-definite matrices, and applying a deterministic transformation. When the distribution is chosen to be Wishart, the model is called a deep Wishart process (DWP). This particular model is of interest because its prior is equivalent to a deep Gaussian process (DGP) prior, but at the same time it is invariant to rotational symmetries, leading to a simpler posterior distribution. Practical inference in the DWP was made possible in recent work ("A variational approximate posterior for the deep Wishart process", Ober and Aitchison, 2021a), where the authors used a generalisation of the Bartlett decomposition of the Wishart distribution as the variational approximate posterior. However, predictive performance in that paper was less impressive than one might expect, with the DWP only beating a DGP on a few of the UCI datasets used for comparison. In this paper, we show that further generalising their distribution to allow linear combinations of rows and columns in the Bartlett decomposition results in better predictive performance, while incurring negligible additional computational cost.
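
Since the Bartlett decomposition is the workhorse here, a minimal sampler may help fix ideas: writing $\Sigma = LL^\top$ and drawing a lower-triangular $A$ with $A_{ii}^2 \sim \chi^2_{n-i+1}$ and $A_{ij} \sim N(0,1)$ for $i > j$ gives $W = LAA^\top L^\top \sim \mathcal{W}_p(n, \Sigma)$. This is the classical construction that the variational posterior above generalises; the generalisation itself, with linear combinations of rows and columns, is not shown.

    import numpy as np

    def wishart_bartlett(n, Sigma, seed=None):
        # Sample W ~ Wishart_p(n, Sigma) via the Bartlett decomposition
        # (requires n >= p so all chi-square dofs are positive).
        rng = np.random.default_rng(seed)
        p = Sigma.shape[0]
        L = np.linalg.cholesky(Sigma)
        A = np.zeros((p, p))
        for i in range(p):
            A[i, i] = np.sqrt(rng.chisquare(n - i))  # dofs n, n-1, ..., n-p+1
            A[i, :i] = rng.standard_normal(i)        # strictly lower triangle ~ N(0,1)
        LA = L @ A
        return LA @ LA.T

    W = wishart_bartlett(10, np.eye(3), seed=0)      # one draw from Wishart_3(10, I)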